
Collaborating Authors: transfer risk


Knowledge Distillation in Wide Neural Networks: Risk Bound, Data Efficiency and Imperfect Teacher

Neural Information Processing Systems

Recent findings on the neural tangent kernel enable us to approximate a wide neural network with a linear model built on the network's random features. In this paper, we theoretically analyze knowledge distillation of a wide neural network. First, we provide a transfer risk bound for the linearized model of the network. Then we propose a metric of the task's training difficulty, called data inefficiency.
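
As a rough illustration of the linearization this abstract relies on, here is a minimal NumPy sketch of standard NTK-style reasoning (our construction, not the paper's code): a wide network f(x; theta) is replaced near its random initialization theta0 by its first-order Taylor expansion, so the parameter gradient at initialization plays the role of the network's random features.

    # Minimal sketch (our code, not the paper's): linearizing a wide
    # one-hidden-layer ReLU network around its random initialization.
    import numpy as np

    rng = np.random.default_rng(0)
    d, m = 5, 10_000                      # input dimension, hidden width

    W0 = rng.standard_normal((m, d))      # random init, NTK parameterization
    v0 = rng.standard_normal(m)

    def f(x, W, v):
        """Network output with 1/sqrt(m) scaling."""
        return v @ np.maximum(W @ x, 0.0) / np.sqrt(m)

    def grad_theta(x, W, v):
        """Gradient of f w.r.t. all parameters: the 'random features'."""
        pre = W @ x
        dW = np.outer(v * (pre > 0), x) / np.sqrt(m)
        dv = np.maximum(pre, 0.0) / np.sqrt(m)
        return np.concatenate([dW.ravel(), dv])

    theta0 = np.concatenate([W0.ravel(), v0])
    x = rng.standard_normal(d)

    # Perturb parameters slightly, as lazy (NTK-regime) training would.
    theta = theta0 + 1e-2 * rng.standard_normal(theta0.size) / np.sqrt(theta0.size)
    W, v = theta[: m * d].reshape(m, d), theta[m * d:]

    f_exact = f(x, W, v)
    f_lin = f(x, W0, v0) + grad_theta(x, W0, v0) @ (theta - theta0)
    print(f"network: {f_exact:.6f}   linearized: {f_lin:.6f}")

For large width m the two outputs agree closely, which is what licenses analyzing distillation on the linear model instead of the network itself.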


Task-level Differentially Private Meta Learning

Neural Information Processing Systems

Specifically, meta learning takes as input a collection of tasks (datasets) sampled from an unknown distribution. Each task defines a learning problem with respect to its input dataset.
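
One plausible reading of "task-level" privacy, sketched under our own assumptions (a Reptile-style meta-update combined with the Gaussian mechanism; none of the names or constants come from the paper): each task contributes a single clipped, noised parameter delta to the meta-update, so any one task's dataset has bounded influence on the learned meta-parameters.

    # Illustrative sketch only: task-level DP via clipped, noised
    # per-task deltas in a Reptile-style meta-learning loop.
    import numpy as np

    rng = np.random.default_rng(1)
    d, n_tasks, n_per_task = 8, 50, 20
    clip_C, sigma, meta_lr = 1.0, 0.8, 0.5

    def sample_task():
        """A linear-regression task with its own ground-truth weights."""
        w_star = rng.standard_normal(d)
        X = rng.standard_normal((n_per_task, d))
        return X, X @ w_star + 0.1 * rng.standard_normal(n_per_task)

    def inner_adapt(theta, X, y, lr=0.1, steps=10):
        """A few gradient steps on one task, from the meta-parameters."""
        w = theta.copy()
        for _ in range(steps):
            w -= lr * X.T @ (X @ w - y) / len(y)
        return w

    theta = np.zeros(d)
    for _ in range(100):                          # meta-iterations
        deltas = []
        for _ in range(n_tasks):
            X, y = sample_task()
            delta = inner_adapt(theta, X, y) - theta
            # Clip so a single task has bounded influence.
            delta *= min(1.0, clip_C / (np.linalg.norm(delta) + 1e-12))
            deltas.append(delta)
        # Gaussian noise calibrated to the per-task clip norm.
        noise = sigma * clip_C * rng.standard_normal(d) / n_tasks
        theta += meta_lr * (np.mean(deltas, axis=0) + noise)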



Reviews: Learning To Learn Around A Common Mean

Neural Information Processing Systems

This paper studies how to define an algorithm that, given an increasing number of tasks sampled from a defined environment, will train on them and learn a model that is well suited for any new task sampled from the same environment. The scenario just described corresponds to the 'learning to learn' problem, where a learning agent improves its learning performance with the number of tasks. Specifically, this work focuses on the 'ridge regression' family of algorithms, and the environment consists of tasks that can be solved by ridge regression with models around a common mean. In other words, we need a learning algorithm that, besides solving regression problems, progressively learns how to approximate the environment's mean model. The transfer risk is a measure of how much the knowledge acquired over the available tasks allows future learning to improve.
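
The description above translates into a short worked sketch (our notation, assuming squared loss): each task t is solved by biased ridge regression, w_t = argmin_w ||X_t w - y_t||^2 + lambda * ||w - h||^2, whose closed form is w_t = (X_t^T X_t + lambda I)^{-1} (X_t^T y_t + lambda h), while the common mean h is estimated incrementally as the running average of the per-task solutions.

    # Minimal sketch following the review's description (our code):
    # ridge regression biased toward a common mean h, with h learned
    # as a running average over a stream of tasks.
    import numpy as np

    rng = np.random.default_rng(2)
    d, lam, n_per_task = 6, 5.0, 15
    h_env = rng.standard_normal(d)        # environment's true common mean

    def ridge_around_mean(X, y, h, lam):
        """argmin_w ||Xw - y||^2 + lam * ||w - h||^2 (closed form)."""
        A = X.T @ X + lam * np.eye(X.shape[1])
        return np.linalg.solve(A, X.T @ y + lam * h)

    h_hat = np.zeros(d)                   # current estimate of the mean
    for t in range(1, 201):               # tasks arrive one by one
        w_t = h_env + 0.3 * rng.standard_normal(d)   # task near the mean
        X = rng.standard_normal((n_per_task, d))
        y = X @ w_t + 0.1 * rng.standard_normal(n_per_task)
        w_hat = ridge_around_mean(X, y, h_hat, lam)
        h_hat += (w_hat - h_hat) / t      # running mean across tasks

    print("error of the learned common mean:", np.linalg.norm(h_hat - h_env))

As more tasks are seen, h_hat approaches the environment mean, and the bias term then makes each new task easier to solve from few samples, which is exactly the transfer effect the review describes.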


Risk of Transfer Learning and its Applications in Finance

Cao, Haoyang, Gu, Haotian, Guo, Xin, Rosenbaum, Mathieu

arXiv.org Artificial Intelligence

Transfer learning is an emerging and popular paradigm for utilizing existing knowledge from previous learning tasks to improve the performance of new ones. In this paper, we propose a novel concept of transfer risk and analyze its properties to evaluate the transferability of transfer learning. We apply transfer learning techniques and this concept of transfer risk to stock return prediction and portfolio optimization problems. Numerical results demonstrate a strong correlation between transfer risk and overall transfer learning performance, where transfer risk provides a computationally efficient way to identify appropriate source tasks in transfer learning, including cross-continent, cross-sector, and cross-frequency transfer for portfolio optimization.
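
The abstract does not spell out how transfer risk is computed, so the sketch below only illustrates the workflow it implies: score every candidate source task with a cheap proxy and transfer from the best-scoring one. The histogram-distance proxy here is our placeholder for illustration, not the paper's definition.

    # Workflow sketch only: rank candidate source tasks by a cheap
    # proxy "transfer risk" before committing to a full transfer.
    import numpy as np

    rng = np.random.default_rng(3)

    def proxy_transfer_risk(source_returns, target_returns, bins=30):
        """Total-variation-style distance between return distributions;
        lower distance is read here as lower transfer risk."""
        lo = min(source_returns.min(), target_returns.min())
        hi = max(source_returns.max(), target_returns.max())
        p, _ = np.histogram(source_returns, bins, (lo, hi), density=True)
        q, _ = np.histogram(target_returns, bins, (lo, hi), density=True)
        return 0.5 * np.abs(p - q).sum() * (hi - lo) / bins

    target = 0.0005 + 0.010 * rng.standard_normal(500)   # target returns
    sources = {                                          # candidate sources
        "US_equities": 0.0004 + 0.011 * rng.standard_normal(500),
        "EU_equities": 0.0002 + 0.014 * rng.standard_normal(500),
        "crypto":      0.0010 + 0.050 * rng.standard_normal(500),
    }
    ranked = sorted(sources, key=lambda k: proxy_transfer_risk(sources[k], target))
    print("source tasks, most to least transferable:", ranked)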


Transfer Learning for Portfolio Optimization

Cao, Haoyang, Gu, Haotian, Guo, Xin, Rosenbaum, Mathieu

arXiv.org Artificial Intelligence

In this work, we explore the possibility of utilizing transfer learning techniques to address the financial portfolio optimization problem. We introduce a novel concept called "transfer risk" within the optimization framework of transfer learning. A series of numerical experiments are conducted in three categories: cross-continent transfer, cross-sector transfer, and cross-frequency transfer. In particular: (1) a strong correlation between transfer risk and the overall performance of transfer learning methods is established, underscoring the significance of transfer risk as a viable indicator of "transferability"; (2) transfer risk is shown to provide a computationally efficient way to identify appropriate source tasks in transfer learning, enhancing the efficiency and effectiveness of the transfer learning approach; (3) the numerical experiments offer valuable new insights for portfolio management across these different settings.


Transformers as Algorithms: Generalization and Stability in In-context Learning

Li, Yingcong, Ildiz, M. Emrullah, Papailiopoulos, Dimitris, Oymak, Samet

arXiv.org Artificial Intelligence

In-context learning (ICL) is a type of prompting where a transformer model operates on a sequence of (input, output) examples and performs inference on-the-fly. In this work, we formalize in-context learning as an algorithm learning problem where a transformer model implicitly constructs a hypothesis function at inference-time. We first explore the statistical aspects of this abstraction through the lens of multitask learning: We obtain generalization bounds for ICL when the input prompt is (1) a sequence of i.i.d. (input, label) pairs or (2) a trajectory arising from a dynamical system. The crux of our analysis is relating the excess risk to the stability of the algorithm implemented by the transformer. We characterize when the transformer/attention architecture provably obeys the stability condition, and we also provide empirical verification. For generalization on unseen tasks, we identify an inductive bias phenomenon in which the transfer learning risk is governed by the task complexity and the number of MTL tasks in a highly predictable manner. Finally, we provide numerical evaluations that (1) demonstrate that transformers can indeed implement near-optimal algorithms on classical regression problems with i.i.d. and dynamic data, (2) provide insights on stability, and (3) verify our theoretical predictions.
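
A minimal sketch of the i.i.d. part of this setup (our simplified construction; no transformer is trained here): each prompt is a random linear task given as i.i.d. (input, label) pairs, and the near-optimal reference algorithm an ICL model would be compared against is ridge regression fit to exactly the pairs in the prompt.

    # Sketch of the evaluation protocol (our simplification): ridge
    # regression on the prompt is the near-optimal in-context baseline.
    import numpy as np

    rng = np.random.default_rng(4)
    d, prompt_len, lam = 5, 20, 1e-3

    def sample_prompt():
        """One in-context task: i.i.d. (x, y) pairs plus a query point."""
        w_star = rng.standard_normal(d)
        X = rng.standard_normal((prompt_len, d))
        y = X @ w_star + 0.05 * rng.standard_normal(prompt_len)
        return X, y, rng.standard_normal(d), w_star

    def ridge_predict(X, y, x_query, lam):
        """The reference algorithm 'in context': ridge fit to the prompt."""
        w = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
        return w @ x_query

    errs = []
    for _ in range(1000):
        X, y, xq, w_star = sample_prompt()
        errs.append((ridge_predict(X, y, xq, lam) - w_star @ xq) ** 2)
    print("reference excess risk per prompt:", np.mean(errs))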


Feasibility and Transferability of Transfer Learning: A Mathematical Framework

Cao, Haoyang, Gu, Haotian, Guo, Xin, Rosenbaum, Mathieu

arXiv.org Artificial Intelligence

Transfer learning is an emerging and popular paradigm for utilizing existing knowledge from previous learning tasks to improve the performance of new ones. Despite its numerous empirical successes, theoretical analysis of transfer learning remains limited. In this paper we build, to the best of our knowledge, the first mathematical framework for the general procedure of transfer learning. Our reformulation of transfer learning as an optimization problem allows, for the first time, an analysis of its feasibility. Additionally, we propose a novel concept of transfer risk to evaluate the transferability of transfer learning. Our numerical studies using the Office-31 dataset demonstrate the potential and benefits of incorporating transfer risk in the evaluation of transfer learning performance.
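
A generic sketch of the two-stage procedure such a framework formalizes (our simplification; the variable names and the risk proxy are ours, not the paper's): optimize on the source task, transfer the solution to the target task, and read off the target loss of the transferred-but-untuned model as a crude transfer-risk-style score before fine-tuning.

    # Generic pretrain-then-fine-tune sketch (our simplification).
    import numpy as np

    rng = np.random.default_rng(5)
    d = 10
    w_src = rng.standard_normal(d)
    w_tgt = w_src + 0.2 * rng.standard_normal(d)   # related target task

    def make_data(w, n):
        X = rng.standard_normal((n, d))
        return X, X @ w + 0.1 * rng.standard_normal(n)

    def fit(X, y, w_init, lr=0.05, steps=200):
        """Plain gradient descent on squared loss from a given start."""
        w = w_init.copy()
        for _ in range(steps):
            w -= lr * X.T @ (X @ w - y) / len(y)
        return w

    Xs, ys = make_data(w_src, 1000)                # abundant source data
    Xt, yt = make_data(w_tgt, 20)                  # scarce target data

    w0 = fit(Xs, ys, np.zeros(d))                  # stage 1: source training
    risk_proxy = np.mean((Xt @ w0 - yt) ** 2)      # loss before fine-tuning
    w1 = fit(Xt, yt, w0)                           # stage 2: target fine-tune
    print(f"pre-fine-tuning target loss (risk proxy): {risk_proxy:.4f}")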